RGloVe: An Improved Approach of Global Vectors for Distributional Entity Relation Representation
Authors
Abstract
Most previous work on relation extraction between named entities is limited to extracting pre-defined relation types, which is inefficient for massive unlabeled text data. Recently, with the appearance of various distributional word representations, unsupervised methods for many natural language processing (NLP) tasks have been widely researched. In this paper, we focus on a new direction in unsupervised relation extraction, called distributional relation representation. Without requiring pre-defined types, distributional relation representation aims to automatically learn entity vectors and then estimate the semantic similarity between these entities. We choose global vectors (GloVe) as our base model for training entity vectors because of its excellent balance between local context and global statistics over the whole corpus. To train the model more efficiently, we improve the traditional GloVe model by using the cosine similarity between entity vectors, instead of the dot product, to approximate the entity co-occurrences. Because cosine similarity normalizes the vectors to unit length, it is intuitively more reasonable and converges more easily to a local optimum. We call the improved model RGloVe. Experimental results on a massive corpus of Sina News show that our proposed model outperforms the traditional global vectors. Finally, the Neo4j graph database is introduced to store the relationships between named entities. The main advantage of Neo4j is that it provides a highly accessible way to query both direct and indirect relationships between entities.
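The change described in the abstract can be sketched on a single entity pair. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the RGloVe objective, so the sketch takes the standard GloVe weighted least-squares loss and swaps the dot product for cosine similarity. Keeping the bias terms, the weighting parameters `x_max` and `alpha`, and the function names `glove_weight` and `pair_loss` are all assumptions made for this example.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Standard GloVe weighting f(x) = min((x / x_max) ** alpha, 1)."""
    return np.minimum((x / x_max) ** alpha, 1.0)

def pair_loss(w_i, w_j, b_i, b_j, x_ij, use_cosine=False):
    """Weighted squared error for one co-occurring entity pair.

    use_cosine=False: standard GloVe score (dot product).
    use_cosine=True:  assumed RGloVe-style score (cosine similarity).
    """
    score = float(np.dot(w_i, w_j))
    if use_cosine:
        # Assumption: divide by both norms, so only the angle matters.
        score /= np.linalg.norm(w_i) * np.linalg.norm(w_j) + 1e-12
    err = score + b_i + b_j - np.log(x_ij)
    return float(glove_weight(x_ij) * err ** 2)

# Hypothetical toy vectors and co-occurrence count.
w_i = np.array([1.0, 2.0])
w_j = np.array([3.0, 4.0])
dot_loss = pair_loss(w_i, w_j, 0.0, 0.0, 10.0)
cos_loss = pair_loss(w_i, w_j, 0.0, 0.0, 10.0, use_cosine=True)
```

With `use_cosine=True` the score lies in [-1, 1] and does not change when either vector is rescaled, which is one way to read the abstract's remark about unit vectors. Under that reading the bias terms would have to absorb most of the magnitude of the log co-occurrence count.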
Similar resources
The study of relation between existence of admissible vectors and amenability and compactness of a locally compact group
The existence of admissible vectors for a locally compact group is closely related to the group's profile. For compact groups, by the Peter-Weyl theorem, every irreducible representation has an admissible vector. In this paper, the conditions under which the converse holds are investigated. Conditions such as views that are admissible and stable will get c...
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital and appealing task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods, such as entity-based and graph-based models, deal with how nouns and noun phrases change role across successive sentences within a short part of a text. They even have limitations in global coheren...
Fully unsupervised low-dimensional representation of adverse drug reaction events through distributional semantics
Electronic health records show great variability, since the same concept is often expressed with different terms: scientific Latin forms, common or lay variants, and even vernacular names. Deep learning enables distributional representation of terms in a vector space, so related terms tend to be close to one another in that space. Accordingly, embedding words through these vectors open...
Mapping and review of Anopheles malaria vectors in Iran
Introduction: Mapping the distribution of endemic diseases and their relations to geographical factors has become important for public health experts, especially in the study of vector-borne protozoan diseases with emphasis on spatial or geographical epidemiology. This study was carried out to provide distribution maps of the geographical pathology vectors of malaria in Iran. Methods: A systemat...
A text representation language for contextual and distributional processing
This thesis examines distributional and contextual aspects of linguistic processing in relation to traditional symbolic approaches. Distributional processing is more commonly associated with statistical methods, while an integrated representation of context spanning document and syntactic structure is lacking in current linguistic representations. This thesis addresses both issues through a nov...
Journal title:
- Algorithms
Volume 10, Issue
Pages -
Publication date 2017